A quasi‐Monte‐Carlo comparison of parametric and semiparametric regression methods for heavy‐tailed and non‐normal data: an application to healthcare costs
Abstract
We conduct a quasi-Monte-Carlo comparison of the recent developments in parametric and semiparametric regression methods for healthcare costs, both against each other and against standard practice. The population of English National Health Service hospital in-patient episodes for the financial year 2007-2008 (summed for each patient) is randomly divided into two equally sized subpopulations to form an estimation set and a validation set. Evaluating out-of-sample using the validation set, a conditional density approximation estimator shows considerable promise in forecasting conditional means, performing best for accuracy of forecasting and among the best four for bias and goodness of fit. The best performing model for bias is linear regression with square-root-transformed dependent variables, whereas a generalized linear model with square-root link function and Poisson distribution performs best in terms of goodness of fit. Commonly used models utilizing a log-link are shown to perform badly relative to other models considered in our comparison.
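The split-sample design described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic rather than NHS episodes, there is a single hypothetical covariate (`age`), and mean error, mean absolute error and root-mean-square error stand in for bias, accuracy and goodness of fit, since the abstract does not name the exact metrics. The model shown is a linear regression on square-root-transformed costs, fitted on the estimation half and evaluated on the validation half.

```python
import math
import random


def fit_ols(x, y):
    """Least-squares intercept and slope for a single covariate."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def evaluate(pred, actual):
    """Out-of-sample summaries: mean error, mean absolute error, RMSE."""
    errs = [p - a for p, a in zip(pred, actual)]
    n = len(errs)
    return {
        "mean_error": sum(errs) / n,
        "mean_abs_error": sum(abs(e) for e in errs) / n,
        "rmse": math.sqrt(sum(e * e for e in errs) / n),
    }


rng = random.Random(0)

# Synthetic right-skewed "costs" (illustrative only, not real data).
population = []
for _ in range(4000):
    age = rng.uniform(20, 90)
    cost = (10 + 0.3 * age + rng.gauss(0, 3)) ** 2
    population.append((age, cost))

# Randomly divide the population into two equally sized halves.
rng.shuffle(population)
half = len(population) // 2
estimation, validation = population[:half], population[half:]

# Estimation set: linear regression of sqrt(cost) on age.
x_est = [a for a, _ in estimation]
z_est = [math.sqrt(c) for _, c in estimation]
b0, b1 = fit_ols(x_est, z_est)
resid_var = sum(
    (z - (b0 + b1 * x)) ** 2 for x, z in zip(x_est, z_est)
) / len(x_est)

# Validation set: predict on the cost scale.
# For a square-root transform with homoscedastic residuals,
# E[cost | x] = (b0 + b1 * x)^2 + Var(residual).
x_val = [a for a, _ in validation]
y_val = [c for _, c in validation]
naive = [(b0 + b1 * x) ** 2 for x in x_val]
pred = [p + resid_var for p in naive]

metrics = evaluate(pred, y_val)
naive_metrics = evaluate(naive, y_val)
print("with retransformation term:", metrics)
print("naive squared prediction:  ", naive_metrics)
```

In this sketch the naive squared prediction sits below the retransformed one by exactly the residual variance, which is why forecasts of conditional means from transformed-outcome models need an explicit retransformation step before their out-of-sample error is compared.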
Similar resources
Generalized Ridge Regression Estimator in Semiparametric Regression Models
In the context of ridge regression, estimation of the ridge (shrinkage) parameter plays an important role in analyzing data. Much effort has gone into developing methods for computing shrinkage estimators for different fully parametric ridge regression approaches, using eigenvalues. However, estimation of the shrinkage parameter has been neglected for semiparametric regression models. The m...
The Family of Scale-Mixture of Skew-Normal Distributions and Its Application in Bayesian Nonlinear Regression Models
Previous studies that fit non-linear regression models with a symmetric structure usually assume normality in the data analysis. This choice may be inappropriate when the distribution of the residual terms is asymmetric. Recently, the family of scale mixtures of skew-normal distributions has attracted the attention of many researchers. This family includes several skewed and heavy-tailed d...
Modeling heavy-tailed distributions in healthcare utilization by parametric and Bayesian methods
Distributions of healthcare utilization such as hospital length of stay and inpatient cost are generally right skewed. The extremes represent legitimate observations on patients who, because of the severity of their illness and need for medical intervention, have long in-stays and incur large costs. In this context we demonstrate the application of several parametric models for fitting heavy ta...
New Efficient Estimation and Variable Selection Methods for Semiparametric Varying-coefficient Partially Linear Models
The complexity of semiparametric models poses new challenges to statistical inference and model selection that frequently arise from real applications. In this work, we propose new estimation and variable selection procedures for the semiparametric varying-coefficient partially linear model. We first study quantile regression estimates for the nonparametric varying-coefficient functions and the ...
New Efficient Estimation and Variable Selection Methods for Semiparametric Varying-coefficient
The complexity of semiparametric models poses new challenges to statistical inference and model selection that frequently arise from real applications. In this work, we propose new estimation and variable selection procedures for the semiparametric varying-coefficient partially linear model. We first study quantile regression estimates for the nonparametric varying-coefficient functions and the...